An Open Source Urdu Resource Grammar

Authors

  • Shafqat Mumtaz Virk
  • Muhammad Humayoun
  • Aarne Ranta
Abstract

We develop a grammar for Urdu in Grammatical Framework (GF). GF is a programming language for defining multilingual grammar applications. The GF resource grammar library currently supports 16 languages. These grammars follow an interlingua approach and consist of morphology and syntax modules that cover a wide range of features of a language. In this paper we explore different syntactic features of the Urdu language and show how to fit them into the multilingual framework of GF. We also discuss how we cover some of the distinguishing features of Urdu, such as ergativity in verb agreement (see Sec 4.2). The main purpose of the GF resource grammar library is to provide an easy way to write natural language applications without knowing the details of syntax, morphology and lexicon. To demonstrate this, we use the Urdu resource grammar to add support for Urdu to the work reported in (Angelov and Ranta, 2010), which is an implementation of Attempto (Attempto 2008) in GF.
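
To illustrate the ergativity pattern mentioned in the abstract, the Python sketch below encodes a simplified textbook description of Urdu verb agreement. In perfective transitive clauses the subject takes the ergative marker ne and the verb agrees with an unmarked object, or takes a default form when the object is marked with ko. Elsewhere the verb agrees with the subject. This is not the paper's GF implementation. The Clause class and both function names are hypothetical, and the rule leaves out many details of the language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """Hypothetical, minimal description of a clause (illustration only)."""
    perfective: bool             # verb is in the perfective aspect
    transitive: bool             # verb takes a direct object
    object_marked: bool = False  # direct object carries the marker "ko"


def subject_case(clause: Clause) -> str:
    """Return the case of the subject under the simplified rule."""
    if clause.perfective and clause.transitive:
        return "ergative"        # subject is followed by "ne"
    return "nominative"


def agreement_controller(clause: Clause) -> str:
    """Return which element the verb agrees with under the simplified rule."""
    if clause.perfective and clause.transitive:
        # The ergative subject cannot control agreement.
        if clause.object_marked:
            return "default"     # default (masculine singular) verb form
        return "object"
    return "subject"


if __name__ == "__main__":
    examples = {
        "perfective transitive, bare object": Clause(True, True),
        "perfective transitive, ko-marked object": Clause(True, True, True),
        "imperfective transitive": Clause(False, True),
        "perfective intransitive": Clause(True, False),
    }
    for label, clause in examples.items():
        print(label, "->", subject_case(clause), "/", agreement_controller(clause))
```

For example, in a perfective transitive sentence such as "laRke ne kitaab paRhii" (the boy read the book), the subject carries ne and the verb form follows the feminine object kitaab, which corresponds to the first case above.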

Similar resources

A Computational Classification of Urdu Dynamic Copula Verb

In this paper, a lexical functional grammar for the automatic classification of the Urdu copula verb hO (be/become) is presented according to linguistic theories. A test suite of sentences containing almost all the different conjugation forms of the copula verb is extracted from a raw corpus. An attempt is made to keep only the cases of copular construction because the copula verb hO is very much dynamic in nature...

A First Approach Towards an Urdu WordNet

This paper reports on a first experiment with developing a lexical knowledge resource for Urdu on the basis of Hindi WordNet. Due to the structural similarity of Urdu and Hindi, we can focus on overcoming the differences in the script systems of the two languages by using transliterators. Various natural language processing tools, among them a computational semantics based on the Urdu ParGra...

Computational evidence that Hindi and Urdu share a grammar but not the lexicon

Hindi and Urdu share a grammar and a basic vocabulary, but are often mutually unintelligible because they use different words in higher registers and sometimes even in quite ordinary situations. We report computational translation evidence of this unusual relationship (it differs from the usual pattern, that related languages share the advanced vocabulary and differ in the basics). We took a GF...

Implementing Urdu Grammar as Open Source Software

Urdu is a challenging language because of, first, its Perso-Arabic script; second, its morphological system, which has inherited grammatical forms and vocabulary from Arabic, Persian and the native languages of South Asia; and third, its pragmatically neutral constituent order (SOV, Subject Object Verb). Today, the state-of-the-art technology for writing grammars (morphology + syntax) is to use special-purpose ...

An Open Source Punjabi Resource Grammar

We describe an open source computational grammar for Punjabi, a resource-poor language. The grammar is developed in GF (Grammatical Framework), a formalism for multilingual grammars. First, we explore different syntactic features of Punjabi, and then we implement them in accordance with GF grammar requirements to make Punjabi the 17th language in the GF resource grammar library.

Publication date: 2010